Core SDK
Installation
The core SDK, dynamofl, is available on PyPI and can be installed with the following command:
pip install dynamofl
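Once installed, import the client class. A minimal sketch, assuming the package exposes DynamoFL at the top level (the exact import path is an assumption, not stated in this reference):
from dynamofl import DynamoFL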
Constructor
DynamoFL
DynamoFL(api_key, host?, metadata?)
Initializes the DynamoFL client instance connection. The DynamoFL object is your entrypoint to the rest of the DynamoFL SDK.
Your DynamoFL API key is required when calling this function, as it identifies your account to the DynamoFL server. To create an API key, navigate to the profile page on DynamoFL and generate a new access token.
Method Parameters
api_key | required string
API key obtained from the UI profile page.
host | optional string
API server identifier, https://api.dynamo.ai by default.
metadata | optional dict
Default metadata to apply to all datasources attached by the instance. Not required for penetration testing or model evaluation.
Returns
DynamoFL
object.
Example
# Initialize instance on default dynamofl server.
dfl = DynamoFL('56dfae3d-12aa-4148-988e-b71aebb8a75c')
# Initialize instance against a locally hosted server
dfl = DynamoFL('56dfae3d-12aa-4148-988e-b71aebb8a75c', host='http://localhost:3000')
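# Initialize with default metadata applied to attached datasources
# (the metadata keys below are illustrative, not prescribed by the SDK)
dfl = DynamoFL('56dfae3d-12aa-4148-988e-b71aebb8a75c', metadata={'team': 'ml-platform'})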
Models - Local
create_model
create_model(name, architecture, architecture_hf_token?, model_file_path?, model_file_paths?, checkpoint_json_file_path?, peft_config_path?, key?)
Creates a new local model object by uploading a model file. The model object can then be used for running evaluations and tests.
Method Parameters
name | required string
Model name.
architecture | required string
HuggingFace hub id for the architecture to load the model file into.
Example: "mistralai/mistral-7b-v0.1"
architecture_hf_token | optional string
HuggingFace token for the provided architecture. Required if the model is private or gated on the hub.
Example: "hf_***"
. Go to hf.co/settings/tokens to generate a new token.
model_file_path | optional string
Path to the file containing the model weights. Provide either model_file_path or model_file_paths (for sharded weights). Valid file extensions include .pt, .bin and .safetensors.
Example: "my_model_path/pytorch_model.bin"
model_file_paths | optional list
List of paths to files containing sharded model weights. Valid file extensions include .pt, .bin and .safetensors.
Example: ["my_model_path/pytorch_model-00001-of-00002.bin", "my_model_path/pytorch_model-00002-of-00002.bin"]
checkpoint_json_file_path | optional string
Path to json file containing sharded model configuration.
Example: "my_model_path/pytorch_model.bin.index.json"
peft_config_path | optional string
Path to json file containing the adapter configuration. Required if the provided model_file_path points to an adapter model file from the PEFT library ("adapter_model.bin").
Example: "my_model_path/adapter_config.json"
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
LocalModel
object.
Example
# Creating a model from a local file upload
model = dfl.create_model(
name="Sheared LLaMA",
architecture="princeton-nlp/Sheared-LLaMA-1.3B",
model_file_path="test_models/pytorch_model.bin",
)
# Creating a model from a local file upload fine-tuned with LoRA (PEFT)
model = dfl.create_model(
name="Sheared LLaMA w/ LoRA",
architecture="princeton-nlp/Sheared-LLaMA-1.3B",
model_file_path="test_models/adapter_model.bin",
peft_config_path="test_models/adapter_config.json",
)
create_hf_model
create_hf_model(name, hf_id, architecture_hf_id?, is_peft?, hf_token?, key?)
Creates a new local model object by providing its HuggingFace hub id. The model object can then be used for running evaluations and tests.
Method Parameters
name | required string
Model name.
hf_id | required string
HuggingFace hub id for the model.
architecture_hf_id | optional string
HuggingFace hub id for the architecture. Required for PEFT adapter models.
is_peft | optional bool
Boolean indicating whether the provided hf_id points to a PEFT adapter model.
hf_token | optional string
HuggingFace token for the provided model/architecture. Required if the model/architecture is private or gated on the hub.
Example: "hf_***"
. Go to hf.co/settings/tokens to generate a new token.
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
LocalModel
object.
Example
model = dfl.create_hf_model(
name="Llama2 7b Chat",
hf_id="NousResearch/Llama-2-7b-chat-hf"
)
Models - Remote
create_openai_model
create_openai_model(name, api_key, api_instance, key?)
Creates a new remote OpenAI model object. The model object can then be used for running evaluations and tests.
Method Parameters
name | required string
Model name.
api_key | required string
OpenAI API Key to use for the model.
api_instance | required string
Model identifier name from OpenAI.
Example: "gpt3.5-turbo-0125"
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
RemoteModel
object.
Example
model = dfl.create_openai_model(
name="GPT 3.5 Model",
api_instance="gpt-3.5-turbo-0125",
api_key="sk-***"
)
create_azure_openai_model
create_azure_openai_model(name, api_key, api_instance, api_version, model_endpoint, key?)
Creates a new remote Azure OpenAI model object. The model object can then be used for running evaluations and tests.
Method Parameters
name | required string
Model name.
api_key | required string
Azure API Key to use for the model.
api_instance | required string
Model identifier name from Azure OpenAI.
Example: "gpt35-turbo"
api_version | required string
Azure OpenAI version to use.
Example: "2023-07-01-preview"
model_endpoint | required string
URL string representing model endpoint.
Example: "https://abc-azure-openai.openai.azure.com/"
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
RemoteModel
object.
Example
model = dfl.create_azure_openai_model(
name="GPT 3.5 Model",
api_instance="gpt35-turbo",
api_key="***",
api_version="2023-07-01-preview",
model_endpoint="https://abc-azure-openai.openai.azure.com/"
)
create_databricks_model
create_databricks_model(name, api_key, model_endpoint, key?)
Creates a new remote Databricks model object. The model object can then be used for running evaluations and tests.
Method Parameters
name | required string
Model name.
api_key | required string
Databricks API Key to use for the model.
model_endpoint | required string
Databricks model endpoint.
Example: "<host>/serving-endpoints/<some-model-name>/invocations"
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
RemoteModel
object.
Example
model = dfl.create_databricks_model(
name="Databricks Mixtral",
api_key="***",
model_endpoint="<host>/serving-endpoints/<mixtral-model-name>/invocations"
)
create_custom_model
create_custom_model(name, remote_model_endpoint, remote_api_auth_config, request_transformation_expression?, response_transformation_expression?, response_type?, batch_size?, multi_turn_support?, enable_retry?, key?)
Creates a new remote model object by providing its endpoint and authentication details. The model object can then be used for running evaluations and tests. This method is particularly useful for incorporating models that are hosted externally but need to be accessed via the DynamoFL platform. For more details on integrating custom models, including request/response formats and JSONata transformations, see the Custom Language Model guide.
Method Parameters
name | required string
Model name.
remote_model_endpoint | required string
The endpoint URL where the remote model is hosted. This URL is used by DynamoFL to send requests to the model.
remote_api_auth_config | required dict
A dictionary containing the authentication configuration needed to connect to the remote model. To learn more about the supported authentication types, see the Supported Authentication Types guide.
request_transformation_expression | optional string
JSONata expression to transform incoming requests. This parameter is optional and only needed if the request format needs to be modified.
response_transformation_expression | optional string
JSONata expression to transform outgoing responses. This parameter is optional and only needed if the response format needs to be adjusted.
response_type | optional string (default: "string")
Defines the expected type of the response from the remote model. Valid types are "string" (default) and "boolean".
batch_size | optional int (default: 1)
Specifies the number of requests that can be sent to the remote model in a single batch. This is useful for models that can process multiple inputs at once for efficiency.
multi_turn_support | optional bool (default: True)
Indicates whether the model supports multi-turn interactions, such as in a conversational AI scenario.
enable_retry | optional bool (default: False)
Enables automatic retries of requests in case of failures. Useful for improving reliability in communication with the remote model.
key | optional string
Unique model identifier key. Will be autogenerated if not provided.
Returns
RemoteModelEntity
: An object representing the remote model integrated into DynamoFL.
Example
from dynamofl.entities import AuthTypeEnum
model = dfl.create_custom_model(
name="External AI Model",
remote_model_endpoint="https://api.externalmodel.com/chat/completions",
remote_api_auth_config={
"_type": AuthTypeEnum.BEARER,
"token": "your_api_token_here"
},
request_transformation_expression=None,
response_transformation_expression=None,
response_type="string",
batch_size=32,
multi_turn_support=False,
enable_retry=True
)
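To illustrate the transformation parameters, the JSONata expressions below sketch how a request could be reshaped into an OpenAI-style chat payload and how a reply could be extracted. The field names (prompt, choices, message.content) describe a hypothetical remote API and are assumptions, not part of the SDK:
# Hypothetical JSONata transformations (field names are illustrative)
request_transformation_expression = '{"messages": [{"role": "user", "content": prompt}]}'
response_transformation_expression = 'choices[0].message.content'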
Models - Helpers
get_model
get_model(key)
Returns model object based on identifier key.
Method Parameters
key | required string
Unique model identifier key.
Returns
LocalModel
or RemoteModel
object.
Example
model = dfl.get_model('unique_identifier_key')
Datasets
create_dataset
create_dataset(file_path, name?, test_file_path?, key?)
Creates a new dataset object by uploading a dataset file.
Method Parameters
file_path | required string
Path to file containing dataset. Valid file extensions include .csv and .txt.
name | optional string
Dataset name.
test_file_path | optional string
Path to the file containing the test split of the dataset. Required for downstream membership inference tests. Valid file extensions include .csv and .txt.
key | optional string
Unique dataset identifier key. Will be autogenerated if not provided.
Returns
Dataset
object.
Example
dataset = dfl.create_dataset(
file_path="test_datasets/train.csv",
name="Fine-tuning dataset",
)
# with test file path
dataset = dfl.create_dataset(
file_path="data/train.csv",
test_file_path="data/test.csv",
name="Fine-tuning dataset",
)
create_hf_dataset
create_hf_dataset(name, hf_id, hf_token?, key?)
Creates a new dataset object that points to a hosted dataset on the HuggingFace hub.
Method Parameters
name | required string
Dataset name.
hf_id | required string
HuggingFace hub id for the dataset. 'train' and 'test' splits are required for downstream membership inference tests.
hf_token | optional string
HuggingFace token for the provided dataset id. Required if the dataset is private or gated on the hub.
key | optional string
Unique dataset identifier key. Will be autogenerated if not provided.
Returns
HFDataset
object.
Example
dataset = dfl.create_hf_dataset(
name="HF dataset",
hf_id="fka/awesome-chatgpt-prompts"
hf_token="hf_***",
)
Vector Databases
ChromaDB
ChromaDB(host, port, collection, ef_inputs)
Initializes a Chroma vector database connection. Required for tests on RAG workflows.
Method Parameters
host | required string
Host connection for vector database.
port | required int
Port for vector database connection.
collection | required string
Vector database collection name.
ef_inputs | required object
Embedding function to be used for vector database.
HuggingFace
api_key | required string
HuggingFace Hub API key to access embedding function.
model_name | required string
HuggingFace Hub embedding function model name.
OpenAI
api_key | required string
OpenAI API key to access embedding function.
model_name | required string
OpenAI embedding function model name.
Azure OpenAI
api_key | required string
Azure OpenAI API key to access embedding function.
model_name | required string
Azure OpenAI embedding function model name.
api_base | required string
Azure OpenAI API base endpoint.
api_version | required string
Azure OpenAI API endpoint version.
Sentence Transformer
model_name | required string
Sentence Transformer embedding function model name.
Returns
ChromaDB
object.
Example
chroma_args = {
"host": "chroma-service.chroma.svc.cluster.local",
"port": 8000,
"collection": "my_collection",
"ef_inputs": {
"ef_type": "sentence_transformer",
"model_name": "my_embedding_function",
},
}
chroma_connection = ChromaDB(**chroma_args)
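For a hosted embedding function, populate ef_inputs with the provider-specific fields listed above. A sketch using the OpenAI option, assuming the ef_type value "openai" (only "sentence_transformer" and "hf" appear verbatim in this reference):
chroma_args = {
"host": "chroma-service.chroma.svc.cluster.local",
"port": 8000,
"collection": "my_collection",
"ef_inputs": {
"ef_type": "openai", # assumed provider identifier
"model_name": "text-embedding-3-small",
"api_key": "sk-***",
},
}
chroma_connection = ChromaDB(**chroma_args)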
LlamaIndexDB
LlamaIndexDB(aws_key, aws_secret_key, s3_bucket_name, ef_inputs)
Initializes a LlamaIndex vector index connection. Required for tests on RAG workflows.
Method Parameters
aws_key | required string
AWS S3 access key for the persistent directory.
aws_secret_key | required string
AWS S3 secret access key for the persistent directory.
s3_bucket_name | required string
AWS S3 bucket name for the persistent directory.
ef_inputs | required object
Embedding function to be used for vector database.
HuggingFace
api_key | required string
HuggingFace Hub API key to access embedding function.
model_name | required string
HuggingFace Hub embedding function model name.
OpenAI
api_key | required string
OpenAI API key to access embedding function.
model_name | required string
OpenAI embedding function model name.
Azure OpenAI
api_key | required string
Azure OpenAI API key to access embedding function.
model_name | required string
Azure OpenAI embedding function model name.
api_base | required string
Azure OpenAI API base endpoint.
api_version | required string
Azure OpenAI API endpoint version.
Sentence Transformer
model_name | required string
Sentence Transformer embedding function model name.
Returns
LlamaIndexDB
object.
Example
llamaindex_args = {
"aws_key": AWS_KEY, # aws s3 access key
"aws_secret_key": AWS_SECRET_KEY, # aws s3 secret access key
"s3_bucket_name": "llamaindex-test", # aws s3 bucket name
"ef_inputs": {
"ef_type": "hf", # embedding function provider
"model_name": "all-MiniLM-L6-v2", # embedding function model name
"api_key": HF_AUTH_TOKEN, # credential needed to access the model
},
}
llamaindex_connection = LlamaIndexDB(**llamaindex_args)
CustomRagDB
CustomRagDB(custom_rag_application_id)
Initializes a CustomRagDB connection. Required for tests on RAG workflows.
Method Parameters
custom_rag_application_id | required int
Custom RAG application id.
Returns
CustomRagDB
object.
Example
custom_rag_arg = {
"custom_rag_application_id": 12 # id of custom-rag-application
}
custom_rag_connection = CustomRagDB(**custom_rag_arg)
Tests - Privacy
create_membership_inference_test
create_membership_inference_test(name, model_key, dataset_id, gpu, input_column, prompts_column?, reference_column?, base_model?, pii_classes?, regex_expressions?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
input_column | required string
Input column in the dataset to use for membership inference.
prompts_column | optional string
Column to specify the prompts for the input. For encoder-decoder models only.
reference_column | optional string
Column to specify the reference for the input. For encoder-decoder models only.
base_model | optional string
Base model to use for the attack, can be a HuggingFace hub id.
pii_classes | optional List[string]
PII classes to attack, e.g. PERSON.
regex_expressions | optional Dict[str, str]
Dictionary of regex expressions to use for extraction, keyed by PII label.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details.
Returns
Test
object.
Example
test_info = dfl.create_membership_inference_test(
name="membership_inference_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
pii_classes=["PERSON", "ORG", "LOC", "DATE"],
regex_expressions={
"USERNAME": r"([a-zA-Z]+_[a-zA-Z0-9]+)",
"EMAIL": r"([a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,})",
"SSN": r"(\d{3}-\d{2}-\d{4})",
},
input_column="email_body",
reference_column="email_body",
base_model="gpt2",
grid=[
{
"temperature": [1.0, 0.5, 0.7],
}
],
)
create_pii_extraction_test
create_pii_extraction_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, extraction_prompt?, sampling_rate?, regex_expressions?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
pii_ref_column | required string
Column in the dataset to sample prompts from.
prompts_column | optional string
Column to specify the prompts for the input. For encoder-decoder models only.
base_model | optional string
Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.
pii_classes | optional List[string]
PII classes to attack, e.g. PERSON.
regex_expressions | optional Dict[str, str]
Dictionary of regex expressions to use for extraction, keyed by PII label.
extraction_prompt | optional string
Prompt for PII extraction. If provided, must be one of "", "dfl_dynamic", or "dfl_ata".
sampling_rate | optional float
Number of times to attempt generating candidates.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
Returns
Test
object.
Example
test_info = dfl.create_pii_extraction_test(
name="pii_extraction_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=VRAMConfig(vramGB=16),
sampling_rate=128,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"seq_len": [256],
}
],
)
create_pii_inference_test
create_pii_inference_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, regex_expressions?, num_targets?, candidate_size?, sample_and_shuffle?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
pii_ref_column | required string
Column in the dataset to sample prompts from.
prompts_column | optional string
Column to specify the prompts for the input. For encoder-decoder models only.
base_model | optional string
Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.
pii_classes | optional List[string]
PII classes to attack, e.g. PERSON.
regex_expressions | optional Dict[str, str]
Dictionary of regex expressions to use for extraction, keyed by PII label.
num_targets | optional int
Number of target sequences to sample for the attack.
candidate_size | optional int
Number of PII candidates to sample randomly for the attack.
sample_and_shuffle | optional int
Number of times to sample and shuffle candidates.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details.
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
target_sequence_scope | str | entire_sample | Determines the breadth of the sentence length for the model |
Returns
Test
object.
Example
test_info = dfl.create_pii_inference_test(
name="pii_inference_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=VRAMConfig(vramGB=16),
num_targets=32,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"seq_len": [256],
}
],
)
create_pii_reconstruction_test
create_pii_reconstruction_test(name, model_key, dataset_id, gpu, pii_ref_column, prompts_column?, base_model?, pii_classes?, regex_expressions?, num_targets?, candidate_size?, sampling_rate?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
pii_ref_column | required string
Column in the dataset to sample prompts from.
prompts_column | optional string
Column to specify the prompts for the input.
base_model | optional string
Optional str for the base model to use for the attack. Can be a HuggingFace hub id or an API instance name.
pii_classes | optional List[string]
PII classes to attack, e.g. PERSON.
regex_expressions | optional Dict[str, str]
Dictionary of regex expressions to use for extraction, keyed by PII label.
num_targets | optional int
Number of target sequences to sample for the attack.
candidate_size | optional int
Number of PII candidates to sample randomly for the attack.
sampling_rate | optional float
Number of times to attempt generating candidates.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details.
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
target_sequence_scope | str | entire_sample | Determines the breadth of the sentence length for the model |
Returns
Test
object.
Example
test_info = dfl.create_pii_reconstruction_test(
name="pii_reconstruction_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
sampling_rate=128,
num_targets=32,
pii_classes=["PERSON", "ORG"],
pii_ref_column="email_body",
grid=[
{
"temperature": [0.5, 1],
"seq_len": [256],
}
],
)
create_sequence_extraction_test
create_sequence_extraction_test(name, model_key, dataset_id, gpu, memorization_granularity, sampling_rate, is_finetuned, base_model?, title?, title_column?, text_column?, source?, grid?)
Creates a sequence extraction test on a model with a dataset to evaluate memorization.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
memorization_granularity | required string
Granularity of memorization, e.g. paragraph or sentence.
sampling_rate | required int
Number of times to attempt generating candidates.
is_finetuned | required bool
Whether the model is finetuned or not; determines whether to generate the fine-tuned or the base model report.
base_model | optional string
Base model to use for the attack, can be a HuggingFace hub id or an API instance name.
title | optional string
Title to use for the attack.
title_column | optional string
Name of the column containing the document title, if the dataset is a csv.
text_column | optional string
Name of the column containing the document text, if the dataset is a csv.
source | optional string
Source of the dataset, e.g. NYT.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details.
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
seq_len | int | Number of tokens to generate in model response, 256 by default |
prompt_length | int | Length of the prefix/suffix being used to prompt the model |
Returns
Test
object.
Example
test_info = dfl.create_sequence_extraction_test(
name="sequence_extraction_test{}".format(SLUG),
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
memorization_granularity="paragraph",
sampling_rate=128,
is_finetuned=True,
source="NYT",
grid=[
{
"temperature": [0],
"seq_len": [256],
"prompt_length": [40]
}
],
)
Tests - Performance and Hallucinations
create_hallucination_test
create_hallucination_test(name, model_key, dataset_id, gpu, hallucination_metrics, input_column, topic_list?, prompts_column?, reference_column?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
hallucination_metrics | required List[string]
Hallucination metrics used, e.g. nli-consistency, unieval-factuality.
input_column | required string
Input column in the dataset to use for the hallucination test.
topic_list | optional List[string]
List of topics to cluster the results.
prompts_column | optional string
Column to specify the prompts for the input.
reference_column | optional string
Column to specify the reference for the input.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
Returns
Test
object.
Example
test_info = dfl.create_hallucination_test(
name="hallucination_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
hallucination_metrics=["nli-consistency", "unieval-factuality"],
input_column="document",
grid=[
{
"temperature": [1.0, 0.7],
}
],
)
create_performance_test
create_performance_test(name, model_key, dataset_id, gpu, performance_metrics, input_column, topic_list?, prompts_column?, reference_column?, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
performance_metrics | required List[string]
List of metrics to calculate for the performance test; options include: ["rouge", "bertscore"]
input_column | required string
Input column in the dataset to use for performance evaluation.
topic_list | optional List[string]
List of topics to cluster the results.
prompts_column | optional string
Column to specify the prompts for the input.
reference_column | optional string
Column to specify the reference for the input.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. See How to use grid? below for further details.
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
How to use grid?
The combinations are generated across the parameters supplied in each dict present in the list; each dict produces its own set of attacks.
Scenario 1 Input =>
grid=[{
"seq_len": [128],
"temperature": [0.5]
}]
Output => 1 attack
{"seq_len": 128, "temperature": 0.5}
Scenario 2 Input =>
grid=[
{"seq_len": [128]},
{"temperature": [0.5]}
]
Output => 2 attacks with the following configs
Attack 1 -> {"seq_len": 128}
- Default value for temperature is used in this attack
Attack 2 -> {"temperature": 0.5}
- Default value for seq_len is used in this attack
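To make the expansion concrete, here is a minimal sketch of the combination logic (illustrative only; the SDK performs this expansion server-side):
from itertools import product

def expand_grid(grid):
    # Each dict contributes the cross-product of its parameter values.
    attacks = []
    for spec in grid:
        keys = list(spec.keys())
        for values in product(*(spec[k] for k in keys)):
            attacks.append(dict(zip(keys, values)))
    return attacks

# Scenario 2 from above yields two attacks, one per dict:
print(expand_grid([{"seq_len": [128]}, {"temperature": [0.5]}]))
# [{'seq_len': 128}, {'temperature': 0.5}]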
Returns
Test
object.
Example
test_info = dfl.create_performance_test(
name="performance_test{}".format(SLUG).format()
model_key=model.key,
dataset_id=dataset._id,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
performance_metrics=["rouge", "bertscore"],
input_column="document",
grid=[
{
"temperature": [1.0, 0.5],
"seq_len": [64]
},
{
"seq_len": [64, 256]
}
],
)
create_rag_hallucination_test
create_rag_hallucination_test(name, model_key, dataset_id, gpu, rag_hallucination_metrics, input_column, example_column?, prompts_column?, prompt_template?, topic_list?, vector_db, grid?)
Creates and orchestrates a new penetration test or evaluation.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
dataset_id | required string
Unique identifier of dataset object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
rag_hallucination_metrics | required List[string]
List of metrics to be calculated during RAG evaluation test, see below for valid options. For more details, please see the appendix.
- RAG Hallucination Metric Options
retrieval-relevance | Evaluate the relevance of documents retrieved from the vector DB using your embedding model
response-relevance | Evaluate the relevance of model-generated responses to input queries
faithfulness | Evaluate the faithfulness of model-generated responses to the retrieved document context
input_column | required string
Input column in the dataset to use for RAG hallucination evaluation.
example_column | optional string
Example column used for few-shot examples.
prompts_column | optional string
Column to specify the prompts for the input.
prompt_template | optional string
Prompt template to use for the attack.
topic_list | optional List[string]
List of topics to cluster the results.
vector_db | required VectorDB
Vector database object to be used in RAG workflow. Supported vector database types:
ChromaDB
CustomRagDB
LlamaIndexDB
LlamaIndexWithChromaDB
PostgresVectorDB
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack. Check How to use grid? for further details
Hyperparameters
Param | Type | Default | Description |
---|---|---|---|
temperature | float | 1.0 | Model temperature, controls model randomness, should be > 0 |
seq_len | int | 256 | Number of tokens to generate in model response |
retrieve_top_k | int | 3 | Number of documents retrieved from the vector database for evaluation and provided as context. The recommended number is 3. Must be > 2 |
Returns
Test
object.
Example
prompt_template = """Answer the question based only on the following context: {context}\n\nQuestion: {question}\n"""
chroma_args = {
"host": "abc-host",
"port": 8000,
"collection": "multidoc2dial",
"ef_inputs": {
"ef_type": "sentence_transformer",
"model_name": "all-MiniLM-L6-v2",
},
}
chroma_connection = ChromaDB(**chroma_args)
test_info = dfl.create_rag_hallucination_test(
name="rag_hallucination_test{}".format(SLUG),
model_key=model.key,
dataset_id=dataset._id,
input_column="queries_pp",
gpu=VRAMConfig(vramGB=16),
rag_hallucination_metrics=["retrieval-relevance", "response-relevance", "faithfulness"],
topic_list=[
"driver license registration",
"student scholarship and financial aid",
"eligibility and benefits for disabled and registration process",
"social security services and retirement plans",
],
prompt_template=prompt_template,
vector_db=chroma_connection,
grid=[
{
"temperature": [1.0],
"seq_len": [256],
"retrieve_top_k": [3],
}
],
)
Tests - Compliance and Security
create_cybersecurity_compliance_test
create_cybersecurity_compliance_test(name, model_key, gpu, base_model?, sampling_rate?, grid?)
Creates and orchestrates a new Cybersecurity Compliance test on a specified model.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
base_model | optional string
Base model to use for the attack, can be a HuggingFace hub id or an API instance name.
sampling_rate | optional int
Number of attack prompts we feed the model. Default is 50.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_cybersecurity_compliance_test(
name="test{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
create_static_jailbreak_test
create_static_jailbreak_test(name, model_key, gpu, fast_mode, dataset_id?, grid?)
Creates a static jailbreak test on a model.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
dataset_id | optional str
ID of the dataset to be used. If not provided, the test will default to the v0 dataset, a small dataset with 50 prompts for testing purposes:
https://github.com/patrickrchao/JailbreakingLLMs/blob/main/data/harmful_behaviors_custom.csv
If using a custom dataset, ensure that it has the following columns (a construction sketch follows the example below):
- "goal": the prompt
- "category": the category of the prompt
- "shortened_prompt": the goal column shortened to 1-2 words (used for encoding attack and ascii art attack)
- "gcg": the prompt that includes the gcg suffix
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_static_jailbreak_test(
name="static_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
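To assemble a compatible custom dataset, the sketch below builds the required columns with pandas and registers the file via create_dataset; the row contents and <gcg-suffix> placeholder are illustrative, and the GPUConfig/GPUType helpers and model object are reused from the earlier examples:
import pandas as pd

# Illustrative rows; real prompts would come from your red-teaming corpus
df = pd.DataFrame({
    "goal": ["<harmful prompt>"],
    "category": ["<prompt category>"],
    "shortened_prompt": ["<1-2 words>"],  # used for the encoding and ascii art attacks
    "gcg": ["<harmful prompt> <gcg-suffix>"],
})
df.to_csv("custom_jailbreak_prompts.csv", index=False)

dataset = dfl.create_dataset(file_path="custom_jailbreak_prompts.csv", name="Custom jailbreak prompts")
test_info = dfl.create_static_jailbreak_test(
    name="static_jailbreak_test_custom",
    model_key=model.key,
    gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
    dataset_id=dataset._id,  # dataset id attribute as used elsewhere in this reference
)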
create_bias_toxicity_test
create_bias_toxicity_test(name, model_key, gpu, fast_mode, grid?)
Creates a bias/toxicity test on a model.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_bias_toxicity_test(
name="bias_toxicity_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
create_adaptive_jailbreak_test
create_adaptive_jailbreak_test(name, model_key, gpu, fast_mode, dataset_id?, grid?)
Creates an adaptive jailbreak test on a model.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
gpu | required GPUSpecification
GPUSpecification object identifying GPU configurations for test.
dataset_id | optional str
ID of the dataset to be used. If not provided, the test will default to an internal attack dataset comprising 50 adversarial prompts.
If using a custom dataset, ensure that the dataset has the following columns:
- "goal": the prompt
- "target": the target column
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_adaptive_jailbreak_test(
name="create_adaptive_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.V100, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
create_prompt_extraction_test
create_prompt_extraction_test(name, model_key, gpu?, grid?)
Creates a system prompt extraction test on a model to evaluate if the model leaks its system prompt.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
gpu | optional GPUSpecification
GPUSpecification object identifying GPU configurations for test. Required for local models, optional for remote models (defaults to A10G).
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_prompt_extraction_test(
name="prompt_extraction_test_{}".format(SLUG).format(),
model_key=model.key,
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
create_multilingual_jailbreak_test
create_multilingual_jailbreak_test(name, model_key, language, gpu?, grid?)
Creates a multilingual jailbreak test on a model to evaluate its safety across different languages.
Method Parameters
name | required string
Test identifier name.
model_key | required string
Unique identifier of model object that test will be run on.
language | required string
Language to test the model in. Currently supports Japanese (ja).
gpu | optional GPUSpecification
GPUSpecification object identifying GPU configurations for test. Required for local models, optional for remote models (defaults to A10G).
grid | optional List[Dict[str, List[str | float | int]]]
Grid of hyperparameters supported for this attack
Hyperparameters
Param | Type | Description |
---|---|---|
temperature | float | Model temperature, controls model randomness, should be > 0 |
Returns
Test
object.
Example
test_info = dfl.create_multilingual_jailbreak_test(
name="multilingual_jailbreak_test_{}".format(SLUG).format(),
model_key=model.key,
language="ja", # for Japanese
gpu=GPUConfig(gpu_type=GPUType.A10G, gpu_count=1),
grid=[
{
"temperature": [0],
}
],
)
Tests - Helpers
get_attack_info
get_attack_info(attack_id)
Returns attack object status.
Method Parameters
attack_id | required string
Unique attack identifier.
Returns
Attack result JSON object.
Example
all_attacks = test_info.attacks
attack_ids = [attack["id"] for attack in all_attacks]
for attack in attack_ids:
attack_info = dfl.get_attack_info(attack)
# Example Response:
# {'id': '6566d2718cf68d15c393ff0d',
# 'status': 'COMPLETED',
# 'failureReason': None,
# 'response': {
# 'metrics': {
# 'precision': 0.023429541595925297,
# 'recall': 0.014047231270358305,
# 'pii_intersection_per_category': {'DATE': 57, 'ORG': 6, 'PERSON': 6},
# 'dataset_pii_per_category': {'ORG': 1848, 'EMAIL': 494, 'USERNAME': 1130, 'DATE': 518, 'PERSON': 922},
# 'dataset_pii_category_count': 5,
# 'dataset_top_3_categories': ['ORG', 'USERNAME', 'PERSON'],
# 'extracted_pii_per_category': {'DATE': 568, 'EMAIL': 424, 'USERNAME': 1120, 'PERSON': 721, 'ORG': 112},
# 'samples': [{'prompt': '', 'response': "..."}, {...}],
# 'model_type': 'decoder'
# },
# 'inferences_location': 's3://dynamofl-pentest-prod/attacks/output/naive_extraction_1701238142.json',
# 'resolved_args': {'attack_args': {...}
# }
# }
# 'testId': '6566d2718cf68d15c393ff05'
# }
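Attacks run asynchronously, so a common pattern is to poll get_attack_info until each attack reaches a terminal state. A minimal sketch; only 'COMPLETED' appears verbatim above, and the 'FAILED' status name is an assumption based on the failureReason field:
import time

for attack_id in attack_ids:
    while True:
        attack_info = dfl.get_attack_info(attack_id)
        if attack_info["status"] in ("COMPLETED", "FAILED"):
            break
        time.sleep(30)  # avoid hammering the API while the attack runs
    print(attack_id, attack_info["status"])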
Custom RAG Adapter
DynamoEval provides the following APIs to manage custom RAG applications:
create_custom_rag_application
create_custom_rag_application(base_url, auth_type, auth_config?, custom_rag_application_routes?)
Creates and registers a new Custom RAG Application with specified configurations.
Method Parameters
base_url | required string
The base URL for the RAG application.
auth_type | required AuthTypeEnum
Authentication type (AuthTypeEnum.NO_AUTH, AuthTypeEnum.BEARER).
auth_config | optional dict
Authentication configuration parameters.
custom_rag_application_routes | optional list
List of route configurations.
Returns
CustomRagApplicationResponseEntity
object.
You can create a Custom RAG Application in several ways:
- Basic creation without authentication:
from dynamofl.entities import AuthTypeEnum
custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.NO_AUTH
)
- Creation with authentication configuration:
from dynamofl.entities import AuthTypeEnum
custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.BEARER,
auth_config={
"token": "bearer-token"
}
)
- Creation along with route configuration:
from dynamofl.entities import CustomRagApplicationRoutesEntity, AuthTypeEnum, RouteTypeEnum
custom_rag_application_routes = [
CustomRagApplicationRoutesEntity(
route_type=RouteTypeEnum.RETRIEVE,
route_path="/retrieve",
request_transformation_expression=None,
response_transformation_expression=None,
)
]
custom_rag_app = dfl.create_custom_rag_application(
base_url="https://api.example.com",
auth_type=AuthTypeEnum.NO_AUTH,
custom_rag_application_routes=custom_rag_application_routes
)
update_custom_rag_application
update_custom_rag_application(custom_rag_application_id, base_url, auth_type, auth_config?)
Updates an existing Custom RAG Application's base configuration.
Method Parameters
custom_rag_application_id | required int
The unique identifier of the RAG application to update.
base_url | required string
The new base URL for the RAG application.
auth_type | required AuthTypeEnum
Authentication type (AuthTypeEnum.NO_AUTH, AuthTypeEnum.BEARER).
auth_config | optional dict
The new authentication configuration parameters.
Returns: CustomRagApplicationResponseEntity
from dynamofl.entities import AuthTypeEnum
updated_app = dfl.update_custom_rag_application(
custom_rag_application_id=123,
base_url="https://api.example.com",
auth_type=AuthTypeEnum.BEARER,
auth_config={"token": "bearer-token"}
)
get_all_custom_rag_applications
get_all_custom_rag_applications(include_routes?)
Retrieves all registered Custom RAG Applications.
Method Parameters
include_routes | optional bool
Whether to include route details in the response. Defaults to False.
Returns: AllCustomRagApplicationResponseEntity
all_apps = dfl.get_all_custom_rag_applications(include_routes=True)
get_custom_rag_application
get_custom_rag_application(custom_rag_application_id, include_routes?)
Retrieves details of a specific Custom RAG Application.
Method Parameters
custom_rag_application_id | required int
The unique identifier of the RAG application to retrieve.
include_routes | optional bool
Whether to include route details in the response. Defaults to False.
Returns: List[CustomRagApplicationResponseEntity]
app = dfl.get_custom_rag_application(
custom_rag_application_id=123,
include_routes=True
)
delete_custom_rag_application
delete_custom_rag_application(custom_rag_application_id)
Removes a Custom RAG Application from the system.
Method Parameters
custom_rag_application_id | required int
The unique identifier of the RAG application to delete.
Returns: None
dfl.delete_custom_rag_application(custom_rag_application_id=123)
create_custom_rag_application_route
create_custom_rag_application_route(custom_rag_application_id, route_type, route_path, request_transformation_expression?, response_transformation_expression?)
Adds a new route to an existing Custom RAG Application.
Method Parameters
custom_rag_application_id | required int
The ID of the RAG application to which the route belongs.
route_type | required RouteTypeEnum
Route type (RouteTypeEnum.RETRIEVE).
route_path | required string
The URL path defining the route.
request_transformation_expression | optional string
JSONata expression to transform incoming requests.
response_transformation_expression | optional string
JSONata expression to transform outgoing responses.
Returns: List[CustomRagApplicationRoutesResponseEntity]
from dynamofl.entities import RouteTypeEnum
new_route = dfl.create_custom_rag_application_route(
custom_rag_application_id=123,
route_type=RouteTypeEnum.RETRIEVE,
route_path="/retrieve",
request_transformation_expression="...",
response_transformation_expression="..."
)
update_custom_rag_application_route
update_custom_rag_application_route(custom_rag_application_id, route_id, route_type, route_path, request_transformation_expression?, response_transformation_expression?)
Updates an existing route in a Custom RAG Application.
Method Parameters
custom_rag_application_id | required int
The ID of the RAG application to which the route belongs.
route_id | required int
The unique identifier of the route to update.
route_type | required RouteTypeEnum
Route type (RouteTypeEnum.RETRIEVE).
route_path | required string
The new URL path for the route.
request_transformation_expression | optional string
JSONata expression to transform incoming requests.
response_transformation_expression | optional string
JSONata expression to transform outgoing responses.
Returns: CustomRagApplicationRoutesResponseEntity
from dynamofl.entities import RouteTypeEnum
updated_route = dfl.update_custom_rag_application_route(
custom_rag_application_id=123,
route_id=456,
route_type=RouteTypeEnum.RETRIEVE,
route_path="/new-search",
request_transformation_expression="...",
response_transformation_expression="..."
)
delete_custom_rag_application_route
delete_custom_rag_application_route(custom_rag_application_id, route_id)
Removes a specific route from a Custom RAG Application.
Method Parameters
custom_rag_application_id | required int
The ID of the RAG application from which to delete the route.
route_id | required int
The unique identifier of the route to delete.
Returns: None
dfl.delete_custom_rag_application_route(
custom_rag_application_id=123,
route_id=456
)